Recursively partitioned mixture model clustering of DNA methylation data using biologically informed correlation structures.

Authors

  • Devin C Koestler
  • Brock C Christensen
  • Carmen J Marsit
  • Karl T Kelsey
  • E Andres Houseman
Abstract

DNA methylation is a well-recognized epigenetic mechanism that has been the subject of a growing body of literature, typically focused on identifying and studying profiles of DNA methylation and their association with human diseases and exposures. In recent years, a number of unsupervised clustering algorithms, both parametric and non-parametric, have been proposed for clustering large-scale DNA methylation data. However, most of these approaches do not incorporate known biological relationships among the measured features and, in some cases, rely on unrealistic assumptions regarding the nature of DNA methylation. Here, we propose a modified version of a recursively partitioned mixture model (RPMM) that integrates information on the proximity of CpG loci within the genome to inform the correlation structures on which subsequent clustering analysis is based. Using simulations and four methylation data sets, we demonstrate that integrating biologically informative correlation structures within RPMM improves goodness-of-fit, clustering consistency, and the ability to detect biologically meaningful clusters compared to methods that ignore such correlation. Integrating biologically informed correlation structures to enhance modeling techniques is motivated by the rapid increase in resolution of DNA methylation microarrays and the increasing understanding of the biology of this epigenetic mechanism.

Similar resources

Semi-supervised recursively partitioned mixture models for identifying cancer subtypes

MOTIVATION Patients with identical cancer diagnoses often progress differently. The disparity we see in disease progression and treatment response can be attributed to the idea that two histologically similar cancers may be completely different diseases on the molecular level. Methods for identifying cancer subtypes associated with patient survival have the capacity to be powerful instruments f...

A comparison of cluster analysis methods using DNA methylation data

MOTIVATION Aberrant DNA methylation is common in cancer. DNA methylation profiles differ between tumor types and subtypes and provide a powerful diagnostic tool for identifying clusters of samples and/or genes. DNA methylation data obtained with the quantitative, highly sensitive MethyLight technology is not normally distributed; it frequently contains an excess of zeros. Established tools to a...

DNA methylation subgroups and the CpG island methylator phenotype in gastric cancer: a comprehensive profiling approach

BACKGROUND Methylation-induced silencing of promoter CpG islands in tumor suppressor genes plays an important role in human carcinogenesis. In colorectal cancer, the CpG island methylator phenotype (CIMP) is defined as widespread and elevated levels of DNA methylation and CIMP+ tumors have distinctive clinicopathological and molecular features. In contrast, the existence of a comparable CIMP su...

Breast Cancer DNA Methylation Profiles Are Associated with Tumor Size and Alcohol and Folate Intake

Although tumor size and lymph node involvement are the current cornerstones of breast cancer prognosis, they have not been extensively explored in relation to tumor methylation attributes in conjunction with other tumor and patient dietary and hormonal characteristics. Using primary breast tumors from 162 (AJCC stage I-IV) women from the Kaiser Division of Research Pathways Study and the Illumi...

Integrative DNA Methylation and Gene Expression Analyses Identify DNA Packaging and Epigenetic Regulatory Genes Associated with Low Motility Sperm

BACKGROUND In previous studies using candidate gene approaches, low sperm count (oligospermia) has been associated with altered sperm mRNA content and DNA methylation in both imprinted and non-imprinted genes. We performed a genome-wide analysis of sperm DNA methylation and mRNA content to test for associations with sperm function. METHODS AND RESULTS Sperm DNA and mRNA were isolated from 21 ...

Journal title:
  • Statistical applications in genetics and molecular biology

Volume 12, Issue 2

Pages: -

Publication date: 2013